Continuous transform method for wavelets

ABSTRACT

The present invention is a method and system for mapping the flow of data that will be the subject of wavelet transform equations to a system comprising adders, subtractors, multipliers and/or dividers to perform the mathematical functions set forth in the particular wavelet transform. A shift register is utilized to continually flow the individual data bytes of the data file being processed through the system. By mapping these hardware components to perform the computations involved in wavelet transform equations, an entire data file (e.g., a digital image) can be processed serially as it flows through the shift register triggered by a clock pulse to the shift register. This eliminates the need for process computers and storage for conducting the multiple read-process-write steps required of prior art wavelet transform processors.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of prior filed U.S. provisional Application No. 60/333,393, filed on Nov. 6, 2001, incorporated fully herein by reference.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The present invention relates generally to systems and methods for processing digital data. In particular, it pertains to a system and method for performing continuous computation of wavelet transforms and can be applied to compression of individual gray scale and color images and key-frame video, for example.

[0004] 2. Description of the Related Art

[0005] The need for reliable and efficient image processing has increased dramatically, particularly in the area of transmission of video, television signals, and other large-size data files. While wide-band Internet connections make it easier to transmit such information over a network, it is still a goal of developers to reduce the amount of bandwidth required to reliably conduct such transmissions.

[0006] One method of simplifying image processing is to use data compression techniques. Well-known examples of data compression techniques include Joint Photographic Experts Group (JPEG) and Moving Picture Experts Group (MPEG) compression. These methods of encoding and decoding real-time video data, while better than their predecessors, are generally not capable of sufficient bandwidth reduction or are susceptible to excessive data loss in a noisy environment.

[0007] JPEG 2000 is an improvement over the JPEG compression method described above. JPEG 2000 compression makes use of the well-known concept of wavelet transforms to perform the compression. Wavelet transforms, described simply, involve the algorithmic transformation of an image into multiple frequency bands, with each frequency band containing the image at a quarter resolution of the original image, in varying degrees of image quality. As the band progresses from a low frequency to a high frequency, the image quality increases. Since the lowest frequency band is generally the most important band for visual sensitivity, the lowest frequency band is typically the first band transmitted, and the highest frequency band will typically be transmitted last. In this way, if only the lowest frequency band is received, the major features of the image will be visible even though the details contained in the higher frequency band are unavailable.

[0008] There are numerous wavelet transforms for transforming image data to handle image processing/compression, each of which, in its basic form, comprises a formula or series of formulas to which the image data is applied. Many examples and explanations of wavelet transforms can be found in patents and in the literature. See, for example, U.S. Pat. No. 6,178,269 to Acharya; U.S. Pat. No. 6,125,210 to Yang; Marcellin et al., “An Overview of JPEG-2000,” Proc. of IEEE Data Compression Conference, pp. 523-541 (2000); and Sadowsky, J., “Investigation of Signal Characteristics Using the Continuous Wavelet Transform,” Johns Hopkins APL Technical Digest, Vol. 17, No. 3 (1996), each of which is incorporated fully herein by reference.

[0009] One example of a well-known wavelet transform is the “integer 5-3” wavelet transform. The 5-3 wavelet transform has very low complexity, provides the best lossless compression, and exhibits a minimum of ringing when quantized. The formula for the 5-3 wavelet transform is as follows: $\left\lbrack {\left( {\left( {C_{1} + C_{2}} \right) \times \frac{1}{2}} \right) - C_{3}} \right\rbrack + \left\lbrack {\left( {\left( {C_{3} + C_{4}} \right) \times \frac{1}{2}} \right) - C_{5}} \right\rbrack$

[0010] where C₁ is the value of the first byte in a 5-byte sequence, C₂ is the value of the second byte in the 5-byte sequence, C₃ is the value of the third byte in the 5-byte sequence, C₄ is the value of the fourth byte in the 5-byte sequence, and C₅ is the value of the fifth byte in the 5-byte sequence. The first five bytes of a data sequence are processed using this algorithm, then another byte is moved in (and the first byte is moved out) so that the byte that originally occupied the first-byte position is bumped out, the byte that originally occupied the second-byte position is bumped up to the first-byte position, etc.

[0011] This process continues until all of the bytes of the image or data file being compressed have been subjected to the algorithm. As each five-byte sequence is processed according to the algorithm, a transformed version of the first byte in the five-byte sequence is output for display or storage. Once all of the bytes have been processed, a transformed version of the entire image exists.
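The sliding five-byte procedure described above can be sketched in software as follows. This is only an illustrative sketch: the function names `transform_5_3` and `sliding_transform` are hypothetical, the use of floating-point halving is an assumption (the text does not state how odd sums are rounded), and the sketch simply stops when fewer than five bytes remain, since boundary handling is not specified.

```python
def transform_5_3(c1, c2, c3, c4, c5):
    """Evaluate the 5-3 formula quoted above on one 5-byte sequence."""
    return ((c1 + c2) / 2 - c3) + ((c3 + c4) / 2 - c5)


def sliding_transform(data):
    """Apply the formula to every 5-byte window, advancing one byte per step.

    Assumption: windows that would run past the end of the data are skipped.
    """
    return [transform_5_3(*data[i:i + 5]) for i in range(len(data) - 4)]


samples = bytes([10, 20, 30, 40, 50, 0, 100])
print(sliding_transform(samples))  # [-30.0, 30.0, -90.0]
```

Each output value corresponds to the byte occupying the first position of its window, matching the description that a transformed version of the first byte is output at each step.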

[0012] Many other wavelet transforms are known, each with its own benefits and drawbacks. For example, the “Daubechies 9-7” wavelet transform is another popular algorithm that provides high performance at low bit rates, but with a substantial increase in complexity. Instead of processing the data 5 bytes at a time, as is done with the 5-3 wavelet transform, 9 bytes are processed at a time in the 9-7 wavelet transform. Both filters provide for multi-resolution extraction and are responsible for much of the substantial quality improvement in JPEG 2000 over the original JPEG.

[0013] Although the data being compressed and decompressed using JPEG and JPEG 2000 consists of a series of individual data bytes (e.g., each data byte representing a single pixel in a graphic image), prior art systems process this data in blocks or sub-blocks, primarily because they typically use a Discrete Cosine Transform (DCT) to perform a digital transform, and a DCT requires that the data be processed in blocks. These data blocks, e.g., an 8×8 pixel sub-block taken from a larger 800×600 pixel block forming an entire image, are processed by a computer which computes the wavelet transform using the wavelet transform formula for each sub-block. Since wavelet transforms comprise a sequence of mathematical computations being performed on each byte of data, numerous read, process, and store operations must be performed for each byte, requiring significant processing power and storage capability. This process is repeated for each byte of each sub-block until the processed image data for the entire block (and thus the entire image) is stored in memory. Once the entire block has been processed it can then be utilized for its intended purpose. While this functions fairly well, the prior art method is inefficient and costly. The prior art method requires computers and/or digital signal processors (DSPs) and thus requires significant computing power to perform the algorithm, store the data, and save/display the data.

[0014] Accordingly, it would be desirable to have a simpler method for computing wavelet transforms on data that does not require the significant computing power of the prior art.

SUMMARY OF THE INVENTION

[0015] The present invention is a method and system for mapping the flow of data that will be the subject of wavelet transform equations to a system comprising adders, subtractors, multipliers and/or dividers to perform the mathematical functions set forth in the particular wavelet transform. A shift register is utilized to continually flow the individual data bytes of the data file being processed through the system. By mapping these hardware components to perform the computations involved in wavelet transform equations, an entire data file (e.g., a digital image) can be processed serially as it flows through the shift register triggered by a clock pulse to the shift register. This eliminates the need for process computers and storage for conducting the multiple read-process-write steps required of prior art wavelet transform processors.

BRIEF DESCRIPTION OF THE DRAWINGS

[0016] FIG. 1 illustrates a system of the prior art for processing and computing wavelet transforms;

[0017] FIGS. 2 and 3 illustrate a first embodiment of the present invention relative to a 5-3 wavelet transform; and

[0018] FIG. 4 illustrates a system for computing a 5-4 wavelet transform on data.

DETAILED DESCRIPTION

[0019] FIG. 1 illustrates a system of the prior art for processing and computing wavelet transforms. Referring to FIG. 1, a system 100 comprises a processor 102 and a memory 104. Image data is input into the processor 102 and is divided into sub-blocks of data using known processing techniques. The bytes of data making up the sub-blocks are then subjected to wavelet transform algorithms. Specifically, the first sequence of data bytes needed for the particular wavelet transform is identified, the first mathematical operation of the wavelet transform is performed, and the result is stored in memory 104. Then the next mathematical operation of the wavelet transform is performed, and that result is also stored in memory 104. Each mathematical operation set forth in the wavelet transform is performed in this manner until the wavelet transform equation has been performed to its final result for the first sequence of data bytes.

[0020] Once all of the mathematical functions of the wavelet transform equation have been performed on the first sequence of data bytes, the next data-byte sequence is obtained (e.g., by moving the first byte in the first sequence out of consideration, moving each remaining byte up one position in the sequence, and inserting the next byte at the end of the sequence), and the process repeats on the new data-byte sequence. This process continues until the sub-block has been processed; then the next sub-block is processed identically, until the entire block has been processed.

[0021] As discussed above, the prior art method requires significant computing power, memory, and time. Data is continually being read, processed, and stored (written) to memory. Results from one computation may be required to perform another computation, requiring that the result itself be read and used for the new computation. This constant read-process-write procedure is time-consuming and requires significant computational power and storage capability.

[0022] FIGS. 2 and 3 illustrate a first embodiment of the present invention relative to a 5-3 wavelet transform. For convenience, the 5-3 wavelet transform equation is repeated below: $\left\lbrack {\left( {\left( {C_{1} + C_{2}} \right) \times \frac{1}{2}} \right) - C_{3}} \right\rbrack + \left\lbrack {\left( {\left( {C_{3} + C_{4}} \right) \times \frac{1}{2}} \right) - C_{5}} \right\rbrack$

[0023] Referring to FIG. 2, a shift register 201, comprising a series of shift register bins 202, 204, 206, 208, and 210, is periodically clocked in a well-known manner so that, with each clock pulse, a new 5-byte sample (in this example) is presented in the shift register bins. The period of the clock pulse should be long enough for each of the math processors (described below) to perform its particular process on the data in the shift register and output a result. In FIG. 2, the first five data bytes of a sample are shown in bins 202, 204, 206, 208, and 210, respectively. With each clock pulse a new 5-byte sequence is made available, so that at the first clock pulse, byte one in bin 202 will be “pushed out” of the shift register 201; byte two in bin 204 will be pushed to bin 202; byte three in bin 206 will be pushed to bin 204; byte four in bin 208 will be pushed to bin 206; byte five in bin 210 will be pushed to bin 208; and the next byte in order (byte six) will move into bin 210, as shown in FIG. 3.
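The clocking behavior of the shift register can be modeled in software as a rough sketch. The class name `ShiftRegister` and its `clock` method are hypothetical and introduced only for illustration; a `collections.deque` with a fixed maximum length stands in for the five hardware bins, with index 0 playing the role of bin 202 and the last index playing the role of bin 210.

```python
from collections import deque


class ShiftRegister:
    """Software stand-in for shift register 201 (bins 202 to 210)."""

    def __init__(self, size=5):
        # Index 0 models the first bin (202); the last index models bin 210.
        self.bins = deque(maxlen=size)

    def clock(self, new_byte):
        """One clock pulse: shift every byte up one bin and load new_byte last.

        Returns the byte pushed out of the first bin, or None while the
        register is still filling.
        """
        full = len(self.bins) == self.bins.maxlen
        pushed_out = self.bins[0] if full else None
        self.bins.append(new_byte)  # a full deque drops its oldest entry
        return pushed_out


register = ShiftRegister()
for byte in (1, 2, 3, 4, 5):    # state of FIG. 2: bytes one to five loaded
    register.clock(byte)
pushed = register.clock(6)       # state of FIG. 3: byte six enters bin 210
print(pushed, list(register.bins))  # 1 [2, 3, 4, 5, 6]
```

After the sixth clock pulse the model holds bytes two through six, which corresponds to the transition from FIG. 2 to FIG. 3 described above.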

[0024] As shown in the example of FIGS. 2 and 3, the bytes in each bin are input to math processors 212, 214, 216, 218, and 220. Math processors 212, 214, 216, 218, and 220 can comprise any known device for performing a mathematical function on input data, such as adders, subtractors, multipliers, or dividers or any combinations thereof. Logic devices for performing these and other mathematical functions are very well known and are not discussed further herein. In this example, math processors 212, 214, and 220 are adders and math processors 216 and 218 are subtractors.

[0025] The add/subtract operations performed by the math processors are carried out on the bytes in each bin. More specifically, in FIG. 2, the byte in bin 202 is added to the byte in bin 204 by math processor 212; and the byte in bin 206 is added to the byte in bin 208 by math processor 214.

[0026] The output of math processor 212 is shown as being multiplied by ½ (via multiplier 213) prior to being input to math processor 216. Thus, the byte in bin 206 is subtracted, by math processor 216, from one-half of the sum of the bytes in bins 202 and 204 that is output from math processor 212, which satisfies the corresponding portion of the 5-3 wavelet transform equation.

[0027] Similarly, the sum from math processor 214 is multiplied by ½ (via multiplier 215) and this product is input to math processor 218. The data byte in bin 210 is then subtracted from this product by math processor 218. Finally, the result of the subtraction process from math processor 216 is added to the result from the subtraction process of math processor 218 by math processor 220, thereby outputting the 5-3 wavelet transform of the byte in bin 202.
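The mapping of the equation onto the math processors can be traced stage by stage in a short sketch. The function name `datapath_5_3` and the variable names are hypothetical; each variable is named after the reference numeral of the element whose output it models, and floating-point multiplication by 0.5 is an assumption standing in for multipliers 213 and 215.

```python
def datapath_5_3(bin202, bin204, bin206, bin208, bin210):
    """Trace one clock period of the FIG. 2 datapath, one stage per line."""
    sum212 = bin202 + bin204      # adder 212
    half213 = sum212 * 0.5        # multiplier 213
    diff216 = half213 - bin206    # subtractor 216

    sum214 = bin206 + bin208      # adder 214
    half215 = sum214 * 0.5        # multiplier 215
    diff218 = half215 - bin210    # subtractor 218

    return diff216 + diff218      # adder 220: transform output


print(datapath_5_3(10, 20, 30, 40, 50))  # -30.0
```

In hardware all of these stages settle within a single clock period; the sequential lines here only reflect the data dependencies between the stages.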

[0028] Thus, as can be seen, the configuration illustrated in FIGS. 2 and 3 performs, in serial fashion, byte by byte, the operations of the equation of the 5-3 wavelet transform. No substantial memory is involved and there is no requirement for computer processors to perform the computation. The computation is done simply, quickly, and inexpensively using the shift register and math processor architecture shown in FIGS. 2 and 3, and there is no need to divide the data file being processed into sub-blocks.

[0029] FIG. 4 illustrates a system for computing a 5-4 wavelet transform on data, which will solve the 5-4 wavelet transform equation: $\left( C_{3} \times 2 \right) - \frac{\left\lbrack \left( \left( C_{1} + C_{2} \right) \times \frac{1}{2} \right) - C_{3} \right\rbrack + \left\lbrack \left( \left( C_{3} + C_{4} \right) \times \frac{1}{2} \right) - C_{5} \right\rbrack}{2}$

[0030] The process is essentially the same as that of FIG. 2, with the exception that an additional stage (math processor 422), and multipliers 405 and 421, are added. Multiplier 405 multiplies the byte in bin 206 by two, and multiplier 421 multiplies the output from math processor 220 by one-half. These two products are subjected to a subtraction process by math processor 422 to perform the additional mathematical functions of the 5-4 wavelet transform equation.
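The additional stage of FIG. 4 can be sketched in the same way. The function name `datapath_5_4` is hypothetical, and the order of the subtraction (the doubled byte minus the halved output of math processor 220) is an assumption read from the leading term of the equation as printed; the text itself states only that the two products are subtracted.

```python
def datapath_5_4(c1, c2, c3, c4, c5):
    """Trace the FIG. 4 datapath: the FIG. 2 stages plus one extra stage."""
    # Output of adder 220, computed exactly as in the 5-3 datapath.
    out220 = ((c1 + c2) * 0.5 - c3) + ((c3 + c4) * 0.5 - c5)

    doubled405 = c3 * 2           # multiplier 405 doubles the third byte
    half421 = out220 * 0.5        # multiplier 421 halves the adder 220 output

    # Math processor 422; subtraction order is an assumption (see lead-in).
    return doubled405 - half421


print(datapath_5_4(10, 20, 30, 40, 50))  # 75.0
```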

[0031] The hardware components used to perform the above-described functions can be any known hardware components that can perform the adding, subtracting, and multiplying functions described above. The shift register can be, for example, SN7474 shift registers manufactured by Motorola or Texas Instruments; LSI Logic also makes shift registers that can perform the above-described functions. The math processor, as noted above, can be any known device (e.g., logical adders, subtractors, multipliers or dividers) capable of carrying out the function required by the wavelet transform.

[0032] In FIGS. 2-4, hardware multipliers are shown to perform the halving or doubling functions described above (or any other mathematical multiplication function required of a wavelet transform equation). As an alternative, in a preferred embodiment, adjacent math processors (e.g., math processor 212 and math processor 216) can be hardwired so that the output from the first math processor (e.g., math processor 212) is wired to the second math processor (e.g., math processor 216) such that the input to the second math processor is shifted by one bit position, thereby halving or doubling (depending upon the direction of the shift, as is well-known) the value output from the first math processor. This will reduce the number of hardware elements required and thus reduce cost and size requirements. Any known math processing hardware/method can be used as long as it performs the function required of the equation.
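The effect of such hardwired connections can be illustrated with integer shift operators, as a minimal sketch. The function names `halve_by_wiring` and `double_by_wiring` are hypothetical. Note that a one-position right shift discards the lowest bit, so an odd sum is rounded down; how the described hardware treats that bit is not stated in the text and is left open here.

```python
def halve_by_wiring(value):
    """Model an output wired one bit position lower: integer halving."""
    return value >> 1


def double_by_wiring(value):
    """Model an output wired one bit position higher: doubling."""
    return value << 1


print(halve_by_wiring(30))     # 15
print(double_by_wiring(30))    # 60
print(halve_by_wiring(7 + 8))  # 7 (the half is lost for an odd sum)
```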

[0033] The above-described method can be used to perform the mathematical functions of any wavelet transform equation and it is understood that one of ordinary skill in the art can easily determine the exact mapping needed based on a simple analysis of the particular equation in view of the invention disclosed above and claimed herein.

[0034] While there have been described herein the principles of the invention, it is to be understood by those skilled in the art that this description is made only by way of example and not as a limitation to the scope of the invention. Accordingly, it is intended by the appended claims to cover all modifications of the invention which fall within the true spirit and scope of the invention.

1. A method for computing the wavelet transform of a plurality of data bytes in a data file, comprising the steps of: identifying all mathematical operations to be performed by the wavelet transform; identifying math processors to perform each of said mathematical operations; and mapping said plurality of data bytes to said math processors to perform said mathematical operations on each of said data bytes.
 2. The method as set forth in claim 1, wherein said mathematical operations are performed periodically on a subset of said plurality of data bytes input to a shift register, the period being based on a clock pulse, further comprising the steps of: clocking said shift register with said clock pulse, so that upon the occurrence of each clock pulse, a new subset of said plurality of data bytes has said mathematical operations performed thereon.
 3. The method as set forth in claim 2, wherein said subset of said plurality of data bytes are in sequential order based on their position in said data file, and wherein said shift register includes a first bin, a second bin, and a last bin, and wherein upon the occurrence of each clock pulse: the data byte in said first bin is removed from said shift register; each remaining data byte in said shift register is moved up in order so that said data byte in said second bin is moved to the first bin in said shift register; and the next data byte, based on the position of said data bytes in said data file, is moved into the last bin of said shift register.
 4. The method as set forth in claim 3, wherein said math processors include adders and subtractors.
 5. The method as set forth in claim 4, wherein said math processors further comprise means for performing multiplication and division on outputs thereof.
 6. The method as set forth in claim 5, wherein said means for performing multiplication and division comprises a multiplier/divider.
 7. The method as set forth in claim 5, wherein said means for performing multiplication and division comprises configuring the connections of said multiplied/divided outputs to subsequent math processors to effect said multiplication/division function.
 8. A system for computing the wavelet transform of a plurality of data bytes in a data file, comprising: a shift register for receiving a subset of said plurality of data bytes; a plurality of math processors operatively coupled to said shift register and mapped to perform each mathematical operation of said wavelet transform; and a clock signal coupled to said shift register; whereby upon the occurrence of each clock signal, a new subset of said data bytes is presented to said system for use in computing the wavelet transform, and whereby the number of clock pulses required to perform said computation on each of said plurality of data bytes is equal to the total number of data bytes in said data file.
 9. The system of claim 8, wherein said math processors include adders and subtractors.
 10. The system of claim 9, wherein said math processors further comprise means for performing multiplication and division on outputs thereof.
 11. The system of claim 10, wherein said means for performing multiplication and division comprises a multiplier/divider.
 12. The system of claim 11, wherein said means for performing multiplication and division comprises configuring the connections of said multiplied/divided outputs to subsequent math processors to effect said multiplication/division function.
 13. A system for computing the wavelet transform of a plurality of data bytes in a data file, comprising: means for identifying all mathematical operations to be performed by the wavelet transform; means for identifying math processors to perform each of said mathematical operations; and means for mapping said plurality of data bytes to said math processors to perform said mathematical operations on each of said data bytes.
 14. The system of claim 13, wherein said mathematical operations are performed periodically on a subset of said plurality of data bytes input to a shift register, the period being based on a clock pulse, said system further comprising: means for clocking said shift register with said clock pulse, so that upon the occurrence of each clock pulse, a new subset of said plurality of data bytes has said mathematical operations performed thereon.
 15. The system of claim 14, wherein said subset of said plurality of data bytes are in sequential order based on their position in said data file, and wherein said shift register includes a first bin, a second bin, and a last bin, said system comprising: means for, upon the occurrence of each clock pulse: removing the data byte in said first bin from said shift register; moving each remaining data byte in said shift register up in order so that said data byte in said second bin is moved to the first bin in said shift register; and moving the next data byte, based on the position of said data bytes in said data file, into the last bin of said shift register.
 16. The system of claim 15, wherein said math processors include adders and subtractors.
 17. The system of claim 16, wherein said math processors further comprise means for performing multiplication and division on outputs thereof.
 18. The system of claim 17, wherein said means for performing multiplication and division comprises a multiplier/divider.
 19. The system of claim 17, wherein said means for performing multiplication and division comprises configuring the connections of said multiplied/divided outputs to subsequent math processors to effect said multiplication/division function. 